Teradata: Duplicate Issues

How to find the duplicate record?

The first step is to define your criteria for a duplicate row. Do you need a combination of two or more columns to be unique together, or are you simply searching for duplicates in a single column? In this example, we are searching for duplicates across column on store_no.

select

store_no

,count(*) from sales

group by store_no

having count(*) > 1;

HAVING is important here because unlike WHERE, HAVING filters on aggregate functions.

SELECT store_name,
category,
FROM sales
GROUP BY store_name, category,
HAVING COUNT(store_no) >1;

This query returns only duplicate records—ones that have the same store_name and category:

How to find the duplicate rows?

SELECT a.*
FROM users a
JOIN (SELECT username, email, COUNT(*)
FROM users 
GROUP BY username, email
HAVING count(*) > 1 ) b
ON a.username = b.username
AND a.email = b.email
ORDER BY a.email 
 
Duplicate records to be found?


Select empid, ename, sal,count(*) from emp group by empid, ename,sal having count(*) >1;




Delete the duplicate records in the table?


DELETE FROM SALES WHERE STORE_NO IN (SELECT STORE_NO FROM SALES WHERE STORE_NO=10)




Delete from Ttab where rowid not in ( select max(rowid) from Ttab group by empid, ename, sal       having count(*) >1)
 




Delete the duplicate records in the table?
 




Delete the duplicate records in the table?
DELETE FROM [SampleDB].[dbo].[Employee]
    WHERE ID NOT IN
    (
        SELECT MAX(ID) AS MaxRecordID
        FROM [SampleDB].[dbo].[Employee]
        GROUP BY [FirstName], 
                 [LastName], 
                 [Country]
    );
 
WITH CTE([firstname], 
    [lastname], 
    [country], 
    duplicatecount)
AS (SELECT [firstname], 
           [lastname], 
           [country], 
           ROW_NUMBER() OVER(PARTITION BY [firstname], 
                                          [lastname], 
                                          [country]
           ORDER BY id) AS DuplicateCount
    FROM [SampleDB].[dbo].[employee])
SELECT *
FROM CTE;
 
SELECT E.ID, 
    E.firstname, 
    E.lastname, 
    E.country, 
    T.rank
FROM [SampleDB].[dbo].[Employee] E
  INNER JOIN
(
 SELECT *, 
        RANK() OVER(PARTITION BY firstname, 
                                 lastname, 
                                 country
        ORDER BY id) rank
 FROM [SampleDB].[dbo].[Employee]
) T ON E.ID = t.ID;
 
DELETE E
    FROM [SampleDB].[dbo].[Employee] E
         INNER JOIN
    (
        SELECT *, 
               RANK() OVER(PARTITION BY firstname, 
                                        lastname, 
                                        country
               ORDER BY id) rank
        FROM [SampleDB].[dbo].[Employee]
    ) T ON E.ID = t.ID
    WHERE rank > 1;
 

Teradata

Duplicate Issues

No comments:

Post a Comment